Goto

Collaborating Authors

 language learning


Studies with impossible languages falsify LMs as models of human language

Bowers, Jeffrey S., Mitchell, Jeff

arXiv.org Artificial Intelligence

Studies with impossible languages falsify LMs as models of human language Jeffrey S. Bowers, School of Psychology and Neuroscience, University of Bristol Jeff Mitchell, School of Engineering and Informatics, University of Sussex Commentary on Futrell, R., & Mahowald, K. (in press). How linguistics learned to stop worrying and love the language models. Abstract According to Futrell and Mahowald (F&M), both infants and language models (LMs) find attested languages easier to learn than "impossible languages" that have unnatural structures. We review the literature and show that LMs often learn attested and many impossible languages equally well. Difficult to learn impossible languages are simply more complex (or random).


LangLingual: A Personalised, Exercise-oriented English Language Learning Tool Leveraging Large Language Models

Gupta, Sammriddh, Singh, Sonit, Joshi, Aditya, Kim, Mira

arXiv.org Artificial Intelligence

Language educators strive to create a rich experience for learners, while they may be restricted in the extend of feedback and practice they can provide. We present the design and development of LangLingual, a conversational agent built using the LangChain framework and powered by Large Language Models. The system is specifically designed to provide real-time, grammar-focused feedback, generate context-aware language exercises and track learner proficiency over time. The paper discusses the architecture, implementation and evaluation of LangLingual in detail. The results indicate strong usability, positive learning outcomes and encouraging learner engagement.


Teacher Demonstrations in a BabyLM's Zone of Proximal Development for Contingent Multi-Turn Interaction

Salhan, Suchir, Gu, Hongyi, Rooein, Donya, Galvan-Sosa, Diana, Gaudeau, Gabrielle, Caines, Andrew, Yuan, Zheng, Buttery, Paula

arXiv.org Artificial Intelligence

Multi-turn dialogues between a child and a caregiver are characterized by a property called contingency - that is, prompt, direct, and meaningful exchanges between interlocutors. We introduce ContingentChat, a teacher-student framework that benchmarks and improves multi-turn contingency in a BabyLM trained on 100M words. Using a novel alignment dataset for post-training, BabyLM generates responses that are more grammatical and cohesive. Experiments with adaptive teacher decoding strategies show limited additional gains. ContingentChat demonstrates the benefits of targeted post-training for dialogue quality and indicates that contingency remains a challenging goal for BabyLMs.


Towards Understanding Ambiguity Resolution in Multimodal Inference of Meaning

Wang, Yufei, Kovashka, Adriana, Fernández, Loretta, Coutanche, Marc N., Wiener, Seth

arXiv.org Artificial Intelligence

We investigate a new setting for foreign language learning, where learners infer the meaning of unfamiliar words in a multimodal context of a sentence describing a paired image. We conduct studies with human participants using different image-text pairs. We analyze the features of the data (i.e., images and texts) that make it easier for participants to infer the meaning of a masked or unfamiliar word, and what language backgrounds of the participants correlate with success. We find only some intuitive features have strong correlations with participant performance, prompting the need for further investigating of predictive features for success in these tasks. We also analyze the ability of AI systems to reason about participant performance, and discover promising future directions for improving this reasoning ability.


Analyzing Information-Seeking Behaviors in a Hakka AI Chatbot: A Cognitive-Pragmatic Study

Lee, Chu-Hsuan, Chang, Chen-Chi, Lee, Hung-Shin, Hsu, Yun-Hsiang, Chen, Ching-Yuan

arXiv.org Artificial Intelligence

With many endangered languages at risk of disappearing, efforts to preserve them now rely more than ever on using technology alongside culturally informed teaching strategies. This study examines user behaviors in TALKA, a generative AI-powered chatbot designed for Hakka language engagement, by employing a dual-layered analytical framework grounded in Bloom's Taxonomy of cognitive processes and dialogue act categorization. We analyzed 7,077 user utterances, each carefully annotated according to six cognitive levels and eleven dialogue act types. These included a variety of functions, such as asking for information, requesting translations, making cultural inquiries, and using language creatively. Pragmatic classifications further highlight how different types of dialogue acts--such as feedback, control commands, and social greetings--align with specific cognitive intentions. The results suggest that generative AI chatbots can support language learning in meaningful ways--especially when they are designed with an understanding of how users think and communicate. They may also help learners express themselves more confidently and connect with their cultural identity. The TALKA case provides empirical insights into how AI-mediated dialogue facilitates cognitive development in low-resource language learners, as well as pragmatic negotiation and socio-cultural affiliation. By focusing on AI-assisted language learning, this study offers new insights into how technology can support language preservation and educational practice.


0c0a7566915f4f24853fc4192689aa7e-Reviews.html

Neural Information Processing Systems

First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. This paper presents a probabilistic model for language learning. The authors cover the nature in which a pair of cooperative agents may work together to create an agreed-upon language. One question I have is how this could possibly be implemented in real-world language learning situations. Your evaluation of the emergence of phenomenon seen in real world languages makes me think you are trying to model or learn something about what real world language evolution is like.


Feedback Indicators: The Alignment between Llama and a Teacher in Language Learning

Rüdian, Sylvio, Elsir, Yassin, Kretschmer, Marvin, Cayrou, Sabine, Pinkwart, Niels

arXiv.org Artificial Intelligence

Automated feedback generation has the potential to enhance students' learning progress by providing timely and targeted feedback. Moreover, it can assist teachers in optimizing their time, allowing them to focus on more strategic and personalized aspects of teaching. To generate high-quality, information-rich formative feedback, it is essential first to extract relevant indicators, as these serve as the foundation upon which the feedback is constructed. Teachers often employ feedback criteria grids composed of various indicators that they evaluate systematically. This study examines the initial phase of extracting such indicators from students' submissions of a language learning course using the large language model Llama 3.1. Accordingly, the alignment between indicators generated by the LLM and human ratings across various feedback criteria is investigated. The findings demonstrate statistically significant strong correlations, even in cases involving unanticipated combinations of indicators and criteria. The methodology employed in this paper offers a promising foundation for extracting indicators from students' submissions using LLMs. Such indicators can potentially be utilized to auto-generate explainable and transparent formative feedback in future research.


Human-like fleeting memory improves language learning but impairs reading time prediction in transformer language models

Thamma, Abishek, Heilbron, Micha

arXiv.org Artificial Intelligence

As words are processed, the exact wordforms that make up incoming sentences are rapidly lost. Cognitive scientists have long believed that this limitation of memory may, paradoxically, help in learning language - an idea supported by classic connectionist modelling work. The rise of Transformers appears to challenge this idea, as these models can learn language effectively, despite lacking memory limitations or other architectural recency biases. Here, we investigate the hypothesized benefit of fleeting memory for language learning in tightly controlled experiments on transformer language models. Training transformers with and without fleeting memory on a developmentally realistic training set, we find that fleeting memory consistently improves language learning (as quantified by both overall language modelling performance and targeted syntactic evaluation) but, unexpectedly, impairs surprisal-based prediction of human reading times. Interestingly, follow up analyses revealed that this discrepancy - better language modeling, yet worse reading time prediction - could not be accounted for by prior explanations of why better language models sometimes fit human reading time worse. Together, these results support a benefit of memory limitations on neural network language learning - but not on predicting behavior.


DIY-MKG: An LLM-Based Polyglot Language Learning System

Tang, Kenan, Li, Yanhong, Qin, Yao

arXiv.org Artificial Intelligence

Existing language learning tools, even those powered by Large Language Models (LLMs), often lack support for polyglot learners to build linguistic connections across vocabularies in multiple languages, provide limited customization for individual learning paces or needs, and suffer from detrimental cognitive offloading. To address these limitations, we design Do-It-Yourself Multilingual Knowledge Graph (DIY-MKG), an open-source system that supports polyglot language learning. DIY-MKG allows the user to build personalized vocabulary knowledge graphs, which are constructed by selective expansion with related words suggested by an LLM. The system further enhances learning through rich annotation capabilities and an adaptive review module that leverages LLMs for dynamic, personalized quiz generation. In addition, DIY-MKG allows users to flag incorrect quiz questions, simultaneously increasing user engagement and providing a feedback loop for prompt refinement. Our evaluation of LLM-based components in DIY-MKG shows that vocabulary expansion is reliable and fair across multiple languages, and that the generated quizzes are highly accurate, validating the robustness of DIY-MKG.


Development and Validation of Engagement and Rapport Scales for Evaluating User Experience in Multimodal Dialogue Systems

Kurata, Fuma, Saeki, Mao, Eguchi, Masaki, Suzuki, Shungo, Takatsu, Hiroaki, Matsuyama, Yoichi

arXiv.org Artificial Intelligence

This study aimed to develop and validate two scales of engagement and rapport to evaluate the user experience quality with multimodal dialogue systems in the context of foreign language learning. The scales were designed based on theories of engagement in educational psychology, social psychology, and second language acquisition.Seventy-four Japanese learners of English completed roleplay and discussion tasks with trained human tutors and a dialog agent. After each dialogic task was completed, they responded to the scales of engagement and rapport. The validity and reliability of the scales were investigated through two analyses. We first conducted analysis of Cronbach's alpha coefficient and a series of confirmatory factor analyses to test the structural validity of the scales and the reliability of our designed items. We then compared the scores of engagement and rapport between the dialogue with human tutors and the one with a dialogue agent. The results revealed that our scales succeeded in capturing the difference in the dialogue experience quality between the human interlocutors and the dialogue agent from multiple perspectives.